307 research outputs found

    Communication/Computation Overlap in MPI

    Get PDF
    This talk discusses optimized collective algorithms and the benefits of leveraging independent hardware entities in a pipelined manner. The resulting approach uses overlap of computation and communication to reach this task. Different examples are given

    Spatial Mixture-of-Experts

    Full text link
    Many data have an underlying dependence on spatial location; it may be weather on the Earth, a simulation on a mesh, or a registered image. Yet this feature is rarely taken advantage of, and violates common assumptions made by many neural network layers, such as translation equivariance. Further, many works that do incorporate locality fail to capture fine-grained structure. To address this, we introduce the Spatial Mixture-of-Experts (SMoE) layer, a sparsely-gated layer that learns spatial structure in the input domain and routes experts at a fine-grained level to utilize it. We also develop new techniques to train SMoEs, including a self-supervised routing loss and damping expert errors. Finally, we show strong results for SMoEs on numerous tasks, and set new state-of-the-art results for medium-range weather prediction and post-processing ensemble weather forecasts.Comment: 20 pages, 3 figures; NeurIPS 202

    Adaptive Routing Strategies for Modern High Performance Networks

    Full text link
    Today’s scalable high-performance applications heavily depend on the bandwidth characteristics of their commu-nication patterns. Contemporary multi-stage interconnec-tion networks suffer from network contention which might decrease application performance. Our experiments show that the effective bisection bandwidth of a non-blocking 512-node Clos network is as low as 38 % if the network is routed statically. In this paper, we propose and ana-lyze different adaptive routing schemes for those networks. We chose Myrinet/MX to implement our proposed routing schemes. Our best adaptive routing scheme is able to in-crease the effective bisection bandwidth to 77 % for 512 nodes and 100 % for smaller node counts. Thus, we show that our proposed adaptive routing schemes are able to im-prove network throughput significantly.
    • …
    corecore